In this paper we propose a weighted symmetric binary matrix factorization (wSBMF)
framework to detect overlapping communities in bipartite networks, which describe
relationships between two types of nodes. Our method improves performance by
recognizing the distinction between two types of missing edges: those among the nodes
within each node type and those between the two node types. Our method can also
explicitly assign community membership, distinguish outliers from overlapping nodes,
and incorporate existing knowledge on the network. We propose a generalized partition
density for bipartite networks as a quality function, which identifies the most
appropriate number of communities. The experimental results on both synthetic and
real-world networks demonstrate the effectiveness of our method.

Keywords : bipartite network; weighted symmetric binary matrix factorization; partition 
density. 


1. Introduction 

Community structure is a common characteristic of various complex networks found
in biological, social, and information systems. A community is commonly defined as
a densely interconnected set of nodes that is loosely connected with the rest of
the network. Studies have shown that community structures are highly relevant to
the organization and functions of the network. For instance, communities in social
networks correspond to social circles, communities in protein-protein interaction
networks capture functional modules, and communities affect the spread of behaviors
and ideas.

Although numerous community detection methods have been proposed, relatively few
methods are designed for bipartite networks [11-17]. A bipartite network
$G(\Delta, \Gamma, E)$ contains two disjoint types of nodes, $\Delta$ and $\Gamma$,
and the edge set $E$ connecting the two parts. There is no edge among the vertices
in $\Delta$ or among those in $\Gamma$. Many systems can be naturally modelled as
bipartite networks [14,18]. For instance, a metabolic network can be considered as a
bipartite network of reactions and metabolites. Many unipartite networks are derived
from bipartite ones. For instance, a scientific collaboration network is derived from
an author-paper bipartite network [20]. A community in a bipartite network
$G(\Delta, \Gamma, E)$ can be defined as a set of nodes, drawn from both $\Delta$ and
$\Gamma$, that are densely interconnected. Bipartite community detection is not
necessarily equivalent to unipartite community detection on the projected networks,
because the projection often destroys important information. Here we would like to
point out the difference between a missing edge within $\Delta$ or within $\Gamma$
and a missing edge between $\Delta$ and $\Gamma$. Imagine a network of people and
their affiliations. With complete information about people's affiliations, the
absence of edge $(i, j)$ ($i \in \Delta$, $j \in \Gamma$) means that person $i$ does
not belong to organization $j$. However, the absence of edge $(i, k)$
($i, k \in \Delta$) simply indicates that we do not know the direct social
relationship between $i$ and $k$.

In our previous work we proposed the Symmetric Binary Matrix Factorization
(SBMF) to detect overlapping communities in unipartite networks and demonstrated
its effectiveness. In this paper, we propose a weighted Symmetric Binary Matrix
Factorization model to detect overlapping communities in bipartite networks. The
model differentiates between the two kinds of missing edges in the bipartite network
to improve detection performance. The model allows us to explicitly assign community
membership to nodes and to distinguish outliers from overlapping nodes, while
providing a way to analyze the strength of membership and incorporate existing
information. To quantify the quality of the communities that we find, we generalize
partition density and use it to select the most appropriate number of communities.


2. Methods 

2.1. Weighted Symmetric Binary Matrix Factorization 

The adjacency matrix of an undirected and unweighted simple graph $G$ with $n$
nodes can be defined as:

$$A_{ij} = \begin{cases} 1, & \text{if } i \sim j, \\ 0, & \text{if } i = j \text{ or } i \nsim j, \end{cases}$$

where $i \sim j$ means there is an edge and $i \nsim j$ means there is no edge.

Imagine an unweighted and undirected bipartite network $G(\Delta, \Gamma, E)$, which
has $n_\Delta$ and $n_\Gamma$ nodes in $\Delta$ and $\Gamma$, respectively, and an
edge set $E$ connecting the two parts. The corresponding adjacency matrix $A$ can be
split into four blocks after the $n_\Delta$th row and the $n_\Delta$th column:

$$A = \begin{pmatrix} O_\Delta & B \\ B^T & O_\Gamma \end{pmatrix},$$

where $O_\Delta$ and $O_\Gamma$ are null matrices of size $n_\Delta \times n_\Delta$
and $n_\Gamma \times n_\Gamma$, respectively, and

$$B_{ij} = \begin{cases} 1, & \text{if } i \sim j,\ i \in \Delta,\ j \in \Gamma, \\ 0, & \text{if } i \nsim j,\ i \in \Delta,\ j \in \Gamma. \end{cases}$$


The meaning of the zeros in $O_\Delta$ and $O_\Gamma$ is different from that in $B$.
If $B$ captures all existing connections perfectly, then all zeros in $B$ indicate
the absence of the corresponding edges. By contrast, the zeros in $O_\Delta$ and
$O_\Gamma$ represent missing information, rather than the absence of edges. To use
this information, we introduce a weight matrix $L$ of size $n \times n$ to handle
these unobserved or missing values, which can be defined as:

$$L_{ij} = \begin{cases} \gamma, & \text{if } A_{ij} \text{ is observed}, \\ 0, & \text{if } A_{ij} \text{ is unobserved}, \end{cases}$$

where $\gamma$ is a nonnegative weight parameter that captures the reliability of
$A_{ij}$. For standard bipartite networks, $L$ can be formulated as:

$$L = \begin{pmatrix} O_\Delta & I_{\Delta,\Gamma} \\ I_{\Gamma,\Delta} & O_\Gamma \end{pmatrix},$$

where $I_{\Delta,\Gamma}$ and $I_{\Gamma,\Delta}$ are matrices where all entries are
one, meaning that only the zeros in $B$ are considered. The sizes of
$I_{\Delta,\Gamma}$ and $I_{\Gamma,\Delta}$ are $n_\Delta \times n_\Gamma$ and
$n_\Gamma \times n_\Delta$, respectively.
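The block structure of $A$ and $L$ can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: the function name `build_adjacency_and_weights`, the argument `gamma`, and the toy matrix `B` are assumptions introduced here.

```python
import numpy as np

def build_adjacency_and_weights(B, gamma=1.0):
    """Assemble the block matrices A and L from a biadjacency matrix B."""
    B = np.asarray(B, dtype=float)
    n_delta, n_gamma = B.shape
    n = n_delta + n_gamma
    A = np.zeros((n, n))
    A[:n_delta, n_delta:] = B      # upper-right block B
    A[n_delta:, :n_delta] = B.T    # lower-left block B^T
    L = np.zeros((n, n))
    L[:n_delta, n_delta:] = gamma  # entries of B are observed
    L[n_delta:, :n_delta] = gamma  # entries of B^T are observed
    return A, L

# Toy biadjacency matrix (hypothetical): 3 nodes in Delta, 2 nodes in Gamma.
B = np.array([[1, 0], [1, 1], [0, 1]])
A, L = build_adjacency_and_weights(B)
```

The diagonal blocks of `L` stay at zero, so the within-type zeros of `A` are treated as unobserved rather than as absent edges.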

Our weighted Symmetric Binary Matrix Factorization (wSBMF) model can be
defined as the following constrained nonlinear program:

$$\begin{aligned}
\min_{U} \quad & \|L \circ (A - UU^T)\|_1 + \sum_i \Big(1 - \Theta\Big(\sum_j U_{ij}\Big)\Big) \\
\text{subject to} \quad & U_{ij}^2 - U_{ij} = 0, \quad i = 1, 2, \ldots, n, \; j = 1, 2, \ldots, c,
\end{aligned} \tag{1}$$

where $\circ$ represents element-wise multiplication (Hadamard product); $A$ is the
adjacency matrix of size $n \times n$ ($n = n_\Delta + n_\Gamma$); $U$ is the
community membership matrix such that $U_{it} = 1$ if node $i$ is in community $t$
and 0 otherwise; and $\Theta$ is the Heaviside step function such that for a matrix
$X$,

$$\Theta(X)_{ij} = \begin{cases} 1, & \text{if } X_{ij} > 0, \\ 0, & \text{if } X_{ij} \le 0. \end{cases}$$

The second term of the objective counts the nodes that are not assigned to any
community. The 1-norm of a matrix $X$ is the largest column sum of $\mathrm{abs}(X)$,
where $\mathrm{abs}(X)_{ij} = \mathrm{abs}(X_{ij})$ and $\mathrm{abs}(\cdot)$ is the
absolute value. Note that numerical experiments show that the Frobenius norm on the
sparse adjacency matrix $A$ often results in an ultra-sparse $U$, or even a null
matrix $U$, which is not informative enough for real analysis. We use the 1-norm
instead to obtain a more reasonable and explainable matrix $U$.
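As a minimal sketch of how the objective in model (1) can be evaluated for a given binary $U$, assuming the 1-norm is the largest absolute column sum and that the Heaviside function returns 0 at 0 (the function name `wsbmf_objective` and the toy matrices are hypothetical):

```python
import numpy as np

def wsbmf_objective(A, L, U):
    """Value of ||L o (A - U U^T)||_1 plus the number of unassigned nodes."""
    residual = L * (A - U @ U.T)                    # Hadamard product with L
    one_norm = np.abs(residual).sum(axis=0).max()   # largest absolute column sum
    unassigned = np.sum(U.sum(axis=1) <= 0)         # sum_i (1 - Theta(sum_j U_ij))
    return float(one_norm + unassigned)

# Toy network (hypothetical): two Delta nodes both linked to one Gamma node.
A = np.array([[0, 0, 1], [0, 0, 1], [1, 1, 0]], dtype=float)
L = np.array([[0, 0, 1], [0, 0, 1], [1, 1, 0]], dtype=float)
obj_all = wsbmf_objective(A, L, np.ones((3, 1)))    # one community holding all nodes
obj_none = wsbmf_objective(A, L, np.zeros((3, 1)))  # every node left unassigned
```

In this toy case the single-community assignment reproduces every observed entry, while the empty assignment pays both for the unexplained edges and for the three unassigned nodes.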


$L$ chooses which entries of the adjacency matrix should be considered in the
optimization and thus allows us to incorporate existing knowledge. For instance, if
we already know that some edges are present between nodes in $\Delta$, then we can
update the corresponding elements of $L$ from zero to $\gamma$. If we want to ignore
edges in $B$, we can simply update the corresponding elements of $L$ from one to
zero. We can even vary $\gamma$ across elements if we can assess the reliability of
the incorporated knowledge.

We initialize $U$ by solving the following weighted Symmetric Nonnegative Matrix
Factorization model:

$$\begin{aligned}
\min_{U} \quad & \|L \circ (A - UU^T)\|_F \\
\text{subject to} \quad & U_{ij} \ge 0, \quad i = 1, 2, \ldots, n, \; j = 1, 2, \ldots, c, \\
& \sum_{j} U_{ij} = 1, \quad i = 1, 2, \ldots, n.
\end{aligned} \tag{2}$$

Then we fix $U$ and discretize the domain $\{u : 0 \le u \le \max(U)\}$ to find the
$u$ that minimizes the following, simpler optimization problem:

$$\min_{u} \; \|L \circ (A - \Theta(U - u)\,\Theta(U - u)^T)\|_1 + \sum_i \Big(1 - \Theta\Big(\sum_j \Theta(U - u)_{ij}\Big)\Big), \tag{3}$$

where $u$ is a scalar. Finally, we obtain the binary matrix $U$ as follows:

$$U := \Theta(U - u).$$
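The threshold search over $u$ can be sketched as a simple grid scan. This is an illustration under stated assumptions, not the authors' implementation: the grid size `num_steps`, the function name `binarize_by_threshold`, and the toy data are hypothetical, and the grid stops just short of $\max(U)$ because thresholding at the maximum would leave every node unassigned.

```python
import numpy as np

def binarize_by_threshold(A, L, U, num_steps=100):
    """Scan thresholds u and keep the one minimizing the binarized objective."""
    best_u, best_val = 0.0, np.inf
    for u in np.linspace(0.0, U.max(), num_steps, endpoint=False):
        Ub = (U > u).astype(float)                       # Theta(U - u)
        residual = L * (A - Ub @ Ub.T)
        val = np.abs(residual).sum(axis=0).max() + np.sum(Ub.sum(axis=1) <= 0)
        if val < best_val:
            best_u, best_val = u, val
    return (U > best_u).astype(float), best_u

# Toy example (hypothetical): 4 Delta nodes, 2 Gamma nodes, two communities.
B = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
A = np.zeros((6, 6)); A[:4, 4:] = B; A[4:, :4] = B.T
L = np.zeros((6, 6)); L[:4, 4:] = 1; L[4:, :4] = 1
U_cont = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9],
                   [0.2, 0.8], [0.9, 0.1], [0.1, 0.9]])
U_bin, u_star = binarize_by_threshold(A, L, U_cont)
```

Ties are broken in favor of the smallest threshold, which is a design choice of this sketch.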

To optimize $U$ for model (2), we initialize $U$ using the alternating least
squares algorithm developed for NMF [24,25]:

$$\begin{aligned}
\min_{U_1, U_2} \quad & \|B - U_1 U_2^T\|_F \\
\text{subject to} \quad & U_1 \ge 0, \; U_2 \ge 0.
\end{aligned} \tag{4}$$

See Appendix: Algorithm 1.

Then, based on the boundedness theorem, we normalize $U_1$ and $U_2$ to
balance their scales:

$$U_1 = U_1 D_1^{-1/2} D_2^{1/2}, \qquad U_2 = U_2 D_2^{-1/2} D_1^{1/2}, \tag{5}$$

where

$$D_1 = \mathrm{diag}(\max U_1(:,1), \max U_1(:,2), \ldots, \max U_1(:,c)),$$
$$D_2 = \mathrm{diag}(\max U_2(:,1), \max U_2(:,2), \ldots, \max U_2(:,c)),$$

and $\mathrm{diag}(a_1, a_2, \ldots, a_n)$ is the diagonal matrix whose diagonal
entries starting from the upper left corner are $a_1, a_2, \ldots, a_n$. $U_1(:,i)$
is the $i$th column of $U_1$. Finally, we merge $U_1$ and $U_2$ into $U$ such that

$$U = \begin{pmatrix} U_1 \\ U_2 \end{pmatrix},$$

and employ the multiplicative update rules for model (2). See Appendix: Algorithm 2.
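The rescaling of $U_1$ and $U_2$ by column maxima can be checked numerically. The sketch below assumes all column maxima are strictly positive; the function name `balance_scales` and the random test factors are hypothetical.

```python
import numpy as np

def balance_scales(U1, U2):
    """Rescale columns so U1 and U2 share column maxima; U1 @ U2.T is unchanged."""
    d1 = U1.max(axis=0)                 # diagonal of D1
    d2 = U2.max(axis=0)                 # diagonal of D2
    U1_bal = U1 * np.sqrt(d2 / d1)      # U1 D1^{-1/2} D2^{1/2}
    U2_bal = U2 * np.sqrt(d1 / d2)      # U2 D2^{-1/2} D1^{1/2}
    return U1_bal, U2_bal

rng = np.random.default_rng(0)
U1 = rng.random((5, 2)) + 0.1           # hypothetical nonnegative factors
U2 = 10.0 * rng.random((4, 2)) + 0.1    # deliberately on a different scale
U1_bal, U2_bal = balance_scales(U1, U2)
U = np.vstack([U1_bal, U2_bal])         # merged initial membership matrix
```

After balancing, each column of both factors has maximum $\sqrt{d_1 d_2}$, so neither part of the stacked matrix dominates the other.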


2.2. Model Selection 


We have proposed a modified partition density to select the appropriate number of
communities. The modified partition density is defined as:

$$D = \sum_{\alpha=1}^{c} \frac{1}{q^{(\alpha)}} \frac{n^{(\alpha)}}{N} D^{(\alpha)},$$

where $D^{(\alpha)}$ is the partition density of community $\alpha$:

$$D^{(\alpha)} = \frac{m^{(\alpha)} - m_{\min}^{(\alpha)}}{m_{\max}^{(\alpha)} - m_{\min}^{(\alpha)}},$$

and $m_{\min}^{(\alpha)} = n^{(\alpha)} - 1$ and
$m_{\max}^{(\alpha)} = n^{(\alpha)}(n^{(\alpha)} - 1)/2$ are the minimum and maximum
possible numbers of links between the nodes in the community $\alpha$, respectively;
$n^{(\alpha)}$ and $m^{(\alpha)}$ are the number of nodes and the number of edges in
the community $\alpha$, respectively; $q^{(\alpha)} = \max_{j \in \alpha} l_j$ is the
maximum number of community memberships ($l_j$) among the nodes ($j$) that belong to
the community $\alpha$; $N$ is the sum of the sizes of different communities and the
number of outliers.
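A minimal sketch of the modified partition density for a unipartite network is given below. It assumes that each community is weighted by $n^{(\alpha)}/(q^{(\alpha)} N)$, that $A$ is a symmetric 0/1 matrix with zero diagonal, and that communities with fewer than three nodes contribute zero because $m_{\max}^{(\alpha)} = m_{\min}^{(\alpha)}$; the function name and the toy graph are hypothetical.

```python
import numpy as np

def modified_partition_density(A, U):
    """Partition density for a binary membership matrix U (nodes x communities)."""
    A = np.asarray(A, dtype=float)
    U = np.asarray(U).astype(bool)
    memberships = U.sum(axis=1)                   # l_j for every node
    sizes = U.sum(axis=0)                         # n^(alpha)
    N = sizes.sum() + np.sum(memberships == 0)    # community sizes plus outliers
    total = 0.0
    for a in range(U.shape[1]):
        nodes = np.flatnonzero(U[:, a])
        n_a = nodes.size
        if n_a < 3:
            continue                              # m_max == m_min, treated as 0 here
        m_a = A[np.ix_(nodes, nodes)].sum() / 2   # edges inside the community
        m_min = n_a - 1
        m_max = n_a * (n_a - 1) / 2
        q_a = memberships[nodes].max()
        total += n_a / (q_a * N) * (m_a - m_min) / (m_max - m_min)
    return total

# Toy graph (hypothetical): two disjoint triangles, one community each.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1
U = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]])
density = modified_partition_density(A, U)
```

Evaluating the function for several values of $c$ and keeping the largest value mirrors the model selection step described above.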

Here we generalize it for bipartite networks by transforming each bipartite
community to a unipartite one and computing the corresponding partition density. For
a community $\alpha$, we define the subnetwork $G^{(\alpha)}$ as the set of nodes in
$\alpha$ and the edges among them. The subnetwork has $n_\Delta^{(\alpha)}$ nodes in
$\Delta$ and $n_\Gamma^{(\alpha)}$ nodes in $\Gamma$, and the corresponding adjacency
matrix is

$$A^{(\alpha)} = \begin{pmatrix} O & B^{(\alpha)} \\ B^{(\alpha)T} & O \end{pmatrix}.$$

Then we transform the bipartite subnetwork $G^{(\alpha)}$ to a unipartite subnetwork
by overlaying the two projections onto $\Delta$ and $\Gamma$. The adjacency matrix
becomes:

$$A^{(\alpha)'} = \begin{pmatrix} B^{(\alpha)} B^{(\alpha)T} & B^{(\alpha)} \\ B^{(\alpha)T} & B^{(\alpha)T} B^{(\alpha)} \end{pmatrix},$$

and the diagonal elements indicate the number of neighbors in the other part that
the corresponding node has. The values of $m^{(\alpha)}$, $m_{\max}^{(\alpha)}$, and
$m_{\min}^{(\alpha)}$ are changed to:

$$m^{(\alpha)'} = \sum_{ij} \big(A^{(\alpha)'} - \mathrm{diag}(A^{(\alpha)'})\big)_{ij} / 2,$$

where $\mathrm{diag}(A^{(\alpha)'})$ is the diagonal matrix whose diagonal entries
are those of $A^{(\alpha)'}$;


m(“)' = 


(«)/ (a) 
1 A \ n A 

2 


1) (a) 

-nb ’ 


#°(4 a) 

2 


1) (a) 

-"4 




and 


)(“)' = 


(™A' - !) + (4 “' - !) + (»a' + 


,(a) 





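Then $D^{(\alpha)}$ becomes:

$$D^{(\alpha)'} = \frac{m^{(\alpha)'} - m_{\min}^{(\alpha)'}}{m_{\max}^{(\alpha)'} - m_{\min}^{(\alpha)'}},$$

and the generalized partition density is:

$$D' = \sum_{\alpha=1}^{c} \frac{1}{q^{(\alpha)}} \frac{n^{(\alpha)}}{N} D^{(\alpha)'}.$$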


2.3. An illustrative Example 

We show a small example that illustrates how the method works. Figure 1 exhibits
a bipartite network with two communities, which can be clearly recovered by our
approach. Specifically, for $c = 2$ we have $m^{(1)'} = 136$, $m^{(2)'} = 114$;
$m_{\min}^{(1)'} = 35$, $m_{\min}^{(2)'} = 35$; $m_{\max}^{(1)'} = 147$,
$m_{\max}^{(2)'} = 147$; $q^{(1)} = 2$, $q^{(2)} = 2$; and $N = 20$. Let us
illustrate how we can incorporate existing knowledge. If we know that Nodes III and
IV are in the same community, then we can revise $A$ and $L$ such that the elements
in the positions of (13,14) and (14,13) are 1. The result for events is changed to

'i 1111 ol T 
000111 ’ 


which groups III and IV together.

Note that the bipartite network can be projected onto the Event part or onto the
People part. Two events are connected if they have at least one common neighbor
in the People part, resulting in a complete network containing six nodes. The loss of
information is obvious and the community structures vanish, which means that the
problem of community detection in bipartite networks is not reducible to the
unipartite case.
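The information loss under projection can be reproduced with a small, purely hypothetical biadjacency matrix. The matrix below is not the network of Fig. 1; it is an assumed stand-in with ten people and six events in which four people attend events from both groups.

```python
import numpy as np

# Hypothetical people-by-events matrix with two overlapping groups.
B = np.zeros((10, 6), dtype=int)
B[0:7, 0:3] = 1     # people 1-7 attend events I-III
B[3:10, 3:6] = 1    # people 4-10 attend events IV-VI

common = B.T @ B                              # shared attendees per event pair
event_projection = (common > 0).astype(int)   # link events sharing any attendee
np.fill_diagonal(event_projection, 0)
num_links = event_projection.sum() // 2
is_complete = num_links == 6 * 5 // 2
```

Because the overlapping people attend events in both groups, every pair of events shares at least one attendee, so the projection is the complete graph on six nodes and the two groups are no longer visible.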


[Figure 1: a bipartite network of Events and People, shown with the community
membership matrices recovered for People and for Events, and the partition density
obtained for each number of communities.]

# communities    Partition density
2                0.40
3                0.19
4                0.10
5                0.07

Fig. 1. Illustration of wSBMF method. The network consists of events and people and
exhibits two overlapping groups where some individuals (4-7) belong to both communities.
2.4. Possible Extensions


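The wSBMF model can be naturally extended to $M$-partite networks, whose adjacency
matrix can be split into $M \times M$ blocks:

$$A = \begin{pmatrix}
O_{\Delta_1,\Delta_1} & B_{\Delta_1,\Delta_2} & O_{\Delta_1,\Delta_3} & \cdots & O_{\Delta_1,\Delta_M} \\
B^T_{\Delta_1,\Delta_2} & O_{\Delta_2,\Delta_2} & B_{\Delta_2,\Delta_3} & \cdots & O_{\Delta_2,\Delta_M} \\
O_{\Delta_3,\Delta_1} & B^T_{\Delta_2,\Delta_3} & O_{\Delta_3,\Delta_3} & \cdots & O_{\Delta_3,\Delta_M} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
O_{\Delta_{M-1},\Delta_1} & O_{\Delta_{M-1},\Delta_2} & \cdots & O_{\Delta_{M-1},\Delta_{M-1}} & B_{\Delta_{M-1},\Delta_M} \\
O_{\Delta_M,\Delta_1} & O_{\Delta_M,\Delta_2} & \cdots & B^T_{\Delta_{M-1},\Delta_M} & O_{\Delta_M,\Delta_M}
\end{pmatrix},$$

where $O_{\Delta_i,\Delta_j}$ is the null matrix of size
$n_{\Delta_i} \times n_{\Delta_j}$, and

$$\big(B_{\Delta_i,\Delta_{i+1}}\big)_{ab} = \begin{cases} 1, & \text{if } a \sim b,\ a \in \Delta_i,\ b \in \Delta_{i+1}, \\ 0, & \text{if } a \nsim b,\ a \in \Delta_i,\ b \in \Delta_{i+1}, \end{cases} \qquad i = 1, 2, \ldots, M-1.$$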

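In this case, $L$ should be reformulated as:

$$L = \begin{pmatrix}
O_{\Delta_1,\Delta_1} & I_{\Delta_1,\Delta_2} & O_{\Delta_1,\Delta_3} & \cdots & O_{\Delta_1,\Delta_M} \\
I_{\Delta_2,\Delta_1} & O_{\Delta_2,\Delta_2} & I_{\Delta_2,\Delta_3} & \cdots & O_{\Delta_2,\Delta_M} \\
O_{\Delta_3,\Delta_1} & I_{\Delta_3,\Delta_2} & O_{\Delta_3,\Delta_3} & \cdots & O_{\Delta_3,\Delta_M} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
O_{\Delta_{M-1},\Delta_1} & O_{\Delta_{M-1},\Delta_2} & \cdots & O_{\Delta_{M-1},\Delta_{M-1}} & I_{\Delta_{M-1},\Delta_M} \\
O_{\Delta_M,\Delta_1} & O_{\Delta_M,\Delta_2} & \cdots & I_{\Delta_M,\Delta_{M-1}} & O_{\Delta_M,\Delta_M}
\end{pmatrix},$$

where $I_{\Delta_i,\Delta_j}$ is the matrix of size
$n_{\Delta_i} \times n_{\Delta_j}$ in which all entries are one.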
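The weight matrix for the $M$-partite case can be assembled block by block. The sketch below assumes, as in the block layout above, that only consecutive parts $\Delta_i$ and $\Delta_{i+1}$ are linked; the function name `chain_weight_matrix` and the part sizes are hypothetical.

```python
import numpy as np

def chain_weight_matrix(sizes, gamma=1.0):
    """Weight matrix L for an M-partite network linking consecutive parts only."""
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    n = int(offsets[-1])
    L = np.zeros((n, n))
    for i in range(len(sizes) - 1):
        rows = slice(offsets[i], offsets[i + 1])       # nodes of part i
        cols = slice(offsets[i + 1], offsets[i + 2])   # nodes of part i + 1
        L[rows, cols] = gamma                          # observed block
        L[cols, rows] = gamma                          # its transpose
    return L

# Three parts with 2, 3 and 1 nodes (hypothetical sizes).
L3 = chain_weight_matrix([2, 3, 1])
```

All other blocks, including those between non-consecutive parts, stay at zero and are therefore treated as unobserved.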


3. Results 

In this section we evaluate the performance of our method using both synthetic and 
real-world networks. 


3.1. Datasets Description 

We first discuss the existing bipartite benchmark networks [11]. The benchmark has
five communities, each having the same number of nodes. Edges only exist between
$\Delta$ and $\Gamma$, with probability $p_{in}$ if the two nodes are in the same
community and $p_{out}$ otherwise. Often, $p_{in}$ is set equal to either 0.5 or 0.9
and $p_{out}$ is set as $\alpha p_{in}$, where $\alpha$ varies from 0 to 1. With
increasing $\alpha$, the community structure becomes less clear. Here we propose two
new, more realistic benchmark graphs that exhibit overlaps, variable community sizes,
and fixed density with different mixing parameters.

• Non-overlapping communities: This class of networks has four communities
with the same number of nodes (each with 32 from $\Delta$ and 32 from $\Gamma$).
Edges exist only between $\Delta$ and $\Gamma$. On average, each node has
$Z_{in} + Z_{out} = 16$ edges. In other words, each node in $\Delta$ has
$Z_{in}$ neighbors within its own community and $Z_{out}$ neighbors outside.
With decreasing $Z_{out}$, the community structures become clearer.

• Overlapping communities: This class of networks has $c$ communities, and
the number of nodes can differ from one community to another. A community
$\alpha$ contains $n_\Delta^{(\alpha)}$ nodes in $\Delta$ and
$n_\Gamma^{(\alpha)}$ nodes in $\Gamma$. On average each $\Delta$ node in the
community $\alpha$ has $Z_{in}^{(\alpha)}$ $\Gamma$ neighbors in its own
community and $Z_{out}^{(\alpha)}$ $\Gamma$ neighbors in other communities.
Since we should have
$Z_{in}^{(\alpha)}/n_\Gamma^{(\alpha)} = Z_{in}^{(\alpha')}/n_\Gamma^{(\alpha')}$
and
$Z_{out}^{(\alpha)}/(n_\Gamma - n_\Gamma^{(\alpha)}) = Z_{out}^{(\alpha')}/(n_\Gamma - n_\Gamma^{(\alpha')})$,
$\alpha, \alpha' = 1, 2, \ldots, c$, it is enough to give $Z_{in}^{(1)}$ and
$Z_{out}^{(1)}$ to generate the network. In our setting there are four
communities, each containing 32 $\Delta$ nodes and 32 $\Gamma$ nodes. In
addition, there are $t$ overlapping $\Delta$ nodes between communities $\alpha$
and $\alpha + 1$, $\alpha = 1, 2, 3$. $Z_{in}^{(1)}$ and $Z_{out}^{(1)}$ are set
to 10 and 6, respectively.
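One way to realize the non-overlapping benchmark is to draw each $\Delta$-$\Gamma$ pair independently. This is an assumption of the sketch, not a description of the authors' generator: the probabilities are chosen so that the expected numbers of within-community and outside neighbors equal $Z_{in}$ and $Z_{out}$, and the function name and defaults are hypothetical.

```python
import numpy as np

def nonoverlapping_benchmark(z_out, z_total=16, n_comm=4, size=32, seed=0):
    """Random biadjacency matrix B with planted non-overlapping communities."""
    rng = np.random.default_rng(seed)
    z_in = z_total - z_out
    labels = np.repeat(np.arange(n_comm), size)       # same labeling on both sides
    same = labels[:, None] == labels[None, :]
    p_in = z_in / size                                # expected Z_in inside
    p_out = z_out / (size * (n_comm - 1))             # expected Z_out outside
    probs = np.where(same, p_in, p_out)
    B = (rng.random(probs.shape) < probs).astype(int)
    return B, labels

B, labels = nonoverlapping_benchmark(z_out=4)
mean_degree = B.sum(axis=1).mean()
```

With this construction the degrees match the target values only on average; a generator that fixes each node's degree exactly would need a different sampling scheme.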

We also use real-world networks for evaluation. 


• Southern women network: This dataset describes the relations between 18
women and 14 social events. Edges only exist between the women and the
events, which makes the graph bipartite. There are 89 edges. The network is
commonly used as a benchmark for bipartite community detection.

• Senator network (http://www.senate.gov/): This is the network of 110 US
senators connected by voting records for 696 bills. There is an edge between
a senator and a bill if the senator voted for the bill. We remove inactive
senators who abstained from more than thirty percent of the bills, and also
the inactive bills that were waived by more than thirty percent of senators.
The final dataset contains 96 senators and 690 bills. There are still
abstention cases in the network, which are considered as missing values and
can be handled by $L$.
3.2. Assessment Standards

Normalized mutual information (NMI) is used as the standard to evaluate community
structure detection performance. The value can be formulated as follows:

$$I_{\mathrm{norm}}(M_1, M_2) = \frac{\sum_{i=1}^{c} \sum_{j=1}^{c} n_{ij} \ln \frac{n\, n_{ij}}{n_i^{(1)} n_j^{(2)}}}{\sqrt{\Big(\sum_{i=1}^{c} n_i^{(1)} \ln \frac{n_i^{(1)}}{n}\Big)\Big(\sum_{j=1}^{c} n_j^{(2)} \ln \frac{n_j^{(2)}}{n}\Big)}},$$

where $M_1$ and $M_2$ are the true cluster label and the computed cluster label,
respectively; $c$ is the community number; $n$ is the number of nodes; $n_{ij}$ is
the number of nodes in the true cluster $i$ that are assigned to the computed
cluster $j$; $n_i^{(1)}$ is the number of nodes in the true cluster $i$; and
$n_j^{(2)}$ is the number of nodes in the computed cluster $j$. The larger the value
of NMI, the better the graph partitioning result. For overlapping benchmarks we use
the generalized normalized mutual information [31].
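For non-overlapping labelings, the NMI can be computed from the contingency table of the two labelings. The sketch below assumes the form in which the mutual information is divided by the square root of the product of the two entropy terms; the function name is hypothetical, and returning 0 when either labeling has a single cluster is a convention chosen here.

```python
import numpy as np

def normalized_mutual_information(true_labels, pred_labels):
    """NMI between two hard labelings of the same n nodes."""
    true_labels = np.asarray(true_labels)
    pred_labels = np.asarray(pred_labels)
    n = true_labels.size
    _, t_idx = np.unique(true_labels, return_inverse=True)
    _, p_idx = np.unique(pred_labels, return_inverse=True)
    counts = np.zeros((t_idx.max() + 1, p_idx.max() + 1))
    np.add.at(counts, (t_idx, p_idx), 1)          # contingency table n_ij
    n1 = counts.sum(axis=1)                       # true cluster sizes
    n2 = counts.sum(axis=0)                       # computed cluster sizes
    expected = np.outer(n1, n2)
    nz = counts > 0                               # skip empty cells (0 * log 0 = 0)
    numer = np.sum(counts[nz] * np.log(n * counts[nz] / expected[nz]))
    denom = np.sqrt(np.sum(n1 * np.log(n1 / n)) * np.sum(n2 * np.log(n2 / n)))
    return float(numer / denom) if denom > 0 else 0.0

nmi_same = normalized_mutual_information([0, 0, 1, 1], [1, 1, 0, 0])
nmi_indep = normalized_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

The measure is invariant to relabeling of clusters, which is why the first pair of labelings is treated as a perfect match.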


3.3. Results 

We compare our method with the BRIM model [11], the only method whose code we
could obtain, on the synthetic benchmarks. Note that the BRIM method cannot handle
overlapping communities or missing values in the network. To show that the problem
of detecting overlapping communities in bipartite networks is not trivial and cannot
be reduced to the unipartite case, we also compare our method with the SBMF model on
the two unipartite projections onto $\Delta$ and $\Gamma$, where two nodes are
connected if they have at least one common neighbor.

In many real scenarios there is background information available. We can
incorporate it into the detection process by revising the objective matrix $A$ and
the weight matrix $L$ to improve the performance of detection and the
interpretability of the results. Specifically, we consider two types of background
information for node pairs of the same type (i.e., $\Delta$ or $\Gamma$): (i)
existence constraint $C_e$: $(i, j) \in C_e$ means that nodes $i$ and $j$ are
connected; (ii) absence constraint $C_a$: $(i, j) \in C_a$ means that nodes $i$ and
$j$ are not connected.

We only consider incorporating background information on the nodes in $\Delta$ in
this paper for simplicity. Given a bipartite network with $n_\Delta$ nodes in
$\Delta$, there are $n_\Delta(n_\Delta - 1)/2$ pairs of nodes available. We randomly
select five percent of the pairs as prior information: if the two nodes in a pair
have the same community label, we assume that the pair belongs to $C_e$; otherwise
it belongs to $C_a$ [32,33]. The zero matrices $O_\Delta$ in $A$ and $L$ are revised
accordingly:

$$(O_\Delta)_{ij} = \begin{cases} 1, & \text{if } (i, j) \in C_e, \\ 0, & \text{otherwise}, \end{cases} \tag{6}$$

where $O_\Delta$ is the submatrix in $A$, and

$$(O_\Delta)_{ij} = \begin{cases} \gamma, & \text{if } (i, j) \in C_e \text{ or } (i, j) \in C_a, \\ 0, & \text{otherwise}, \end{cases}$$

where $O_\Delta$ is the submatrix in $L$. We set $\gamma$ equal to 1.
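The revision of $A$ and $L$ from sampled node pairs can be sketched as follows. The sketch assumes one known community label per $\Delta$ node (the non-overlapping case) and that the $\Delta$ nodes occupy the first rows and columns; the function name, the sampling fraction used in the example, and the toy labels are hypothetical.

```python
import numpy as np
from itertools import combinations

def add_pair_constraints(A, L, labels_delta, fraction=0.05, gamma=1.0, seed=0):
    """Turn a random sample of Delta-Delta pairs into observed entries of A and L."""
    rng = np.random.default_rng(seed)
    A, L = A.copy(), L.copy()
    pairs = list(combinations(range(len(labels_delta)), 2))
    k = int(round(fraction * len(pairs)))
    for idx in rng.choice(len(pairs), size=k, replace=False):
        i, j = pairs[idx]
        same = labels_delta[i] == labels_delta[j]
        A[i, j] = A[j, i] = 1.0 if same else 0.0   # C_e gets 1, C_a stays 0
        L[i, j] = L[j, i] = gamma                  # the pair is now observed
    return A, L

# Toy setup (hypothetical): 10 Delta nodes in two groups, 2 Gamma nodes.
labels_delta = [0] * 5 + [1] * 5
A0 = np.zeros((12, 12))
L0 = np.zeros((12, 12))
A1, L1 = add_pair_constraints(A0, L0, labels_delta, fraction=0.2)
```

Absence constraints leave the entry of $A$ at zero but still switch on the corresponding entry of $L$, so the zero is then treated as a true absence rather than as missing information.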

The results are shown in Figs. 2 and 3. They show that the wSBMF method performs
much better than SBMF on the unipartite networks, indicating that the community
detection problem in bipartite networks is not reducible to the unipartite case, and
that it also performs better than BRIM on the non-overlapping community benchmark
graphs. Our method can identify a reasonable number of communities, and the
background information can significantly improve the results.

[Figure 2, panel (b): number of communities versus $Z_{out}$.]

Fig. 2. Performance of BRIM and wSBMF on the bipartite networks, SBMF on
the monopartite networks, and the number of communities estimated by BRIM and
wSBMF on non-overlapping networks. We randomly select five percent of pairs in
$\Delta$ for background information.

Fig. 3. Performance of wSBMF and the number of communities estimated by SBMF
on overlapping networks. We randomly select five percent of pairs in $\Delta$ for
background information.

We also evaluate the method on the southern women network and the senator network.
Fig. 4 shows the partition density under different community numbers on the two
networks; the most appropriate number is 2 for both of them. For the southern women
network, the result is very similar to previously reported results, where there are
two groups of women, women 1-9 and 9-18. For the senator network, the result is
consistent with American two-party politics. Fig. 5 shows the community structure of
the women network detected by wSBMF. We also use the exponential entropy $e^{H_i}$,
$i = 1, 2, \ldots, n_\Delta$, to analyze the strength of the women's community
memberships, where

$$H_i = -\sum_{j=1}^{2} U_{ij} \log U_{ij}, \quad i = 1, 2, \ldots, n_\Delta.$$

The result is given in Fig. 6.
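The exponential entropy of each row of a continuous membership matrix can be computed as below. The sketch assumes the natural logarithm, rows that are nonnegative and not all zero, and the convention $0 \log 0 = 0$; rows are renormalized to sum to one before the entropy is taken, and the function name and example matrix are hypothetical.

```python
import numpy as np

def exponential_entropy(U):
    """Return exp(H_i) for every row of a nonnegative membership matrix U."""
    U = np.asarray(U, dtype=float)
    P = U / U.sum(axis=1, keepdims=True)            # rows sum to one
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(P > 0, P * np.log(P), 0.0)  # 0 * log 0 treated as 0
    return np.exp(-terms.sum(axis=1))

# A crisp member and a node split evenly between two communities.
values = exponential_entropy([[1.0, 0.0], [0.5, 0.5]])
```

With two communities the value ranges from 1 for a crisp membership to 2 for a membership split evenly between the two, which makes it readable as an effective number of communities per node.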


4. Discussion 

In this paper we have shown how to apply symmetric binary matrix factorization and
partition density to find communities in bipartite networks. The model is parameter
free, easy to implement, and flexible enough to incorporate background information.
Experimental results on both the synthetic and real-world networks demonstrate
the effectiveness of the proposed method.

There are two interesting problems for future work: (i) extension of the method
to weighted bipartite networks and directed bipartite networks; and (ii) theoretical
investigation of partition density and algorithm design for its direct optimization.

Fig. 4. Averaged partition density of wSBMF versus community number on (a)
women network and (b) senator network.

Fig. 5. Communities detected by wSBMF model in the women network. There are no
outliers and overlapping nodes.

[Figure 6: horizontal axis, Women Labels.]

Fig. 6. Exponential entropy of women. Higher value means fuzzier membership degree.

Appendix 

Summary of Algorithms 1 and 2. We set the iteration number $C_1$ equal to 10
and the iteration number $C_2$ equal to 100.


Algorithm 1 Nonnegative Matrix Factorization (Alternating Least Squares)

Require: $B$, $C_1$
Ensure: $U_1$, $U_2$

Initialize elements of $U_1$ with nonnegative random numbers drawn from [0,1].
for $t = 1 : C_1$ do
    Solve for $U_2$ in equation $U_1^T U_1 U_2^T = U_1^T B$
    $U_2 = \max(U_2, 0)$
    Solve for $U_1$ in equation $U_2^T U_2 U_1^T = U_2^T B^T$
    $U_1 = \max(U_1, 0)$
end for
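A compact NumPy sketch of the alternating least squares steps is given below. It is an illustration under assumptions rather than the authors' code: the least-squares subproblems are solved with `numpy.linalg.lstsq`, and the function name `nmf_als`, the seed, and the toy matrix are hypothetical.

```python
import numpy as np

def nmf_als(B, c, num_iter=10, seed=0):
    """Approximate B by U1 @ U2.T with nonnegative factors via projected ALS."""
    rng = np.random.default_rng(seed)
    B = np.asarray(B, dtype=float)
    U1 = rng.random((B.shape[0], c))                    # random start in [0, 1)
    for _ in range(num_iter):
        # Least-squares solution of U1 @ U2.T = B for U2, then clip at zero.
        U2 = np.linalg.lstsq(U1, B, rcond=None)[0].T
        U2 = np.maximum(U2, 0)
        # Least-squares solution of U2 @ U1.T = B.T for U1, then clip at zero.
        U1 = np.linalg.lstsq(U2, B.T, rcond=None)[0].T
        U1 = np.maximum(U1, 0)
    return U1, U2

# Toy biadjacency matrix (hypothetical) with two blocks.
B = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]])
U1, U2 = nmf_als(B, c=2)
```

Clipping negative entries after each solve keeps the factors nonnegative but does not guarantee a monotone decrease of the error, which is why the result is used only as an initial point.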


Algorithm 2 Weighted Symmetric Nonnegative Matrix Factorization (Multiplicative
Updates)

Require: $A$, $U$, $C_2$
Ensure: $U$
for $t = 1 : C_2$ do
    $U := U \circ \dfrac{[(L \circ A)U]}{[(L \circ (UU^T))U]}$ (element-wise division)
end for
$U_{ij} := U_{ij} / \sum_j U_{ij}$, $i = 1, 2, \ldots, n$
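The multiplicative update can be sketched in NumPy as follows. The small constant `eps` added to the denominator is an assumption introduced here to avoid division by zero, and the function name, the random starting point, and the toy network are hypothetical.

```python
import numpy as np

def weighted_symnmf(A, L, U, num_iter=100, eps=1e-12):
    """Multiplicative updates for min ||L o (A - U U^T)||_F with U >= 0."""
    U = np.asarray(U, dtype=float).copy()
    LA = L * A                                   # fixed numerator factor
    for _ in range(num_iter):
        numer = LA @ U
        denom = (L * (U @ U.T)) @ U + eps        # eps guards against 0 / 0
        U *= numer / denom                       # element-wise update
    return U

# Toy bipartite network (hypothetical): 4 Delta nodes, 2 Gamma nodes.
B = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
A = np.zeros((6, 6)); A[:4, 4:] = B; A[4:, :4] = B.T
L = np.zeros((6, 6)); L[:4, 4:] = 1; L[4:, :4] = 1
U0 = np.random.default_rng(0).random((6, 2)) + 0.1
U_fit = weighted_symnmf(A, L, U0)
```

Because every factor in the update is nonnegative, the iterate stays nonnegative without any explicit projection.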